On Constraints on the Search Path of Policy Iteration
Author
Abstract
We describe a few structural properties enjoyed by the policy space of problems such as infinite-horizon MDPs. From these properties we derive constraints limiting the number of iterations of algorithms such as the policy iteration algorithm for infinite-horizon MDPs and the Hoffman-Karp algorithm for simple stochastic games. An open problem is to characterize the growth of the worst-case number of iterations of these algorithms subject to the derived constraints.
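For readers unfamiliar with the algorithm the abstract refers to, here is a minimal sketch of policy iteration on a discounted infinite-horizon MDP that records the sequence of policies visited (the "search path"). It is an illustration under stated assumptions, not the paper's construction: the discounted criterion, the rule of switching every improvable state to its greedy action, the data layout, and all names (`policy_iteration`, `evaluate`, `solve_linear`, the toy MDP) are hypothetical choices made for this example.

```python
from typing import List, Tuple

# Assumed layout (hypothetical): P[s][a] is a list of (next_state, probability)
# pairs and R[s][a] is the immediate reward for action a in state s.
Transitions = List[List[List[Tuple[int, float]]]]
Rewards = List[List[float]]


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def evaluate(policy: List[int], P: Transitions, R: Rewards, gamma: float) -> List[float]:
    """Exact policy evaluation: solve (I - gamma * P_pi) v = r_pi."""
    n = len(P)
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [0.0] * n
    for s in range(n):
        a = policy[s]
        b[s] = R[s][a]
        for t, p in P[s][a]:
            A[s][t] -= gamma * p
    return solve_linear(A, b)


def policy_iteration(P: Transitions, R: Rewards, gamma: float = 0.9, eps: float = 1e-9):
    """Return (final policy, its value, list of policies visited)."""
    n = len(P)
    policy = [0] * n
    path = [tuple(policy)]
    while True:
        v = evaluate(policy, P, R, gamma)
        new_policy = policy[:]
        for s in range(n):
            def q(a: int) -> float:
                return R[s][a] + gamma * sum(p * v[t] for t, p in P[s][a])
            best = max(range(len(P[s])), key=q)
            # Switch only on a strict improvement, so the loop terminates.
            if q(best) > q(policy[s]) + eps:
                new_policy[s] = best
        if new_policy == policy:
            return policy, v, path
        policy = new_policy
        path.append(tuple(policy))


# Toy two-state MDP (hypothetical): state 1 pays reward 1 for staying put.
P = [[[(0, 1.0)], [(1, 1.0)]], [[(1, 1.0)], [(0, 1.0)]]]
R = [[0.0, 0.0], [1.0, 0.0]]
policy, v, path = policy_iteration(P, R, gamma=0.9)
```

Because each step strictly improves the value of the current policy, no policy can appear twice in `path`. This is the basic kind of constraint on the search path that bounds the number of iterations by the number of policies.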
Similar resources
Corrector-predictor arc-search interior-point algorithm for $P_*(\kappa)$-LCP acting in a wide neighborhood of the central path
In this paper, we propose an arc-search corrector-predictor interior-point method for solving $P_*(\kappa)$-linear complementarity problems. The proposed algorithm searches for the optimizers along an ellipse that is an approximation of the central path. The algorithm generates a sequence of iterates in the wide neighborhood of the central path introduced by Ai and Zhang. The algorithm does not de...
A path-following infeasible interior-point algorithm for semidefinite programming
We present a new algorithm obtained by changing the search directions in the algorithm given in [8]. This algorithm is based on a new technique for finding the search direction and the strategy of the central path. At each iteration, we use only the full Nesterov-Todd (NT) step. Moreover, we obtain the currently best known iteration bound for the infeasible interior-point algorithms with full NT...
The optimal search for a Markovian target when the search path is constrained: the infinite-horizon case
A target moves among a finite number of cells according to a discrete-time homogeneous Markov chain. The searcher is subject to constraints on the search path, i.e., the cells available for search in the current epoch are a function of the cell searched in the previous epoch. The aim is to identify a search policy that maximizes the infinite-horizon total expected reward earned. We show the foll...
Providing an algorithm for solving general optimization problems based on Domino theory
Optimization is a very important process in engineering. Engineers can improve production only by using optimization tools to reduce its costs, including time consumption. Many real-world engineering problems cannot be solved mathematically (by mathematical programming solvers). Therefore, meta-heuristic optimization algorithms are needed to solve these pro...
An Efficient Method for Selecting a Reliable Path under Uncertainty Conditions
In a network in which some paths may be blocked, choosing a reliable path, one with a high survival probability, is an important and practical issue. This issue is especially important in critical situations such as natural disasters, floods and earthquakes. In the case of the reliable path, survival or blocking of each arc on a network in critical situations is an un...